

Search for: All records

Creators/Authors contains: "Kumar, Tanishq"


  1. We consider a toy model that exhibits grokking, recently advanced by [Kumar et al., 2023], and take advantage of the simple setting to derive the dynamics of the train and test loss using Dynamical Mean Field Theory (DMFT). This gives a closed-form expression for the gap between train and test loss that characterizes grokking in this toy model. The expression illustrates how two parameters of interest, NTK alignment and network laziness, control the size of this gap, and how grokking emerges as a uniquely offline property during repeated training over the same dataset. This is the first quantitative characterization of grokking dynamics in a general setting that makes no assumptions about weight decay, weight norm, etc.